Conversation

@mmabrouk
Member

This commit introduces a new guide on running evaluations using the Agenta SDK. It provides an overview of the process, enhancing user understanding of programmatic evaluation execution.

…r feedback and Langchain observability. Adjusted paths in existing documentation to reflect new file structure.
This commit introduces a new guide on managing testsets, including creating, listing, retrieving, and upserting testsets using the Agenta SDK. Additionally, a Jupyter notebook is added to demonstrate these functionalities with practical examples.
…entation and examples related to evaluation processes in Agenta.
This commit introduces a comprehensive guide on creating custom evaluators and utilizing built-in evaluators to assess application outputs. The new documentation covers the structure, inputs, return values, and practical examples for both custom and built-in evaluators, enhancing the overall understanding of evaluation processes in Agenta.
This commit introduces a new guide on defining and configuring applications for evaluation with the Agenta SDK. It covers the basic application structure, input handling, return values, and provides practical examples, enhancing user understanding of application setup and usage.
@linear

linear bot commented Nov 12, 2025

@vercel

vercel bot commented Nov 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Preview | Comments | Updated (UTC)
agenta-documentation | Ready | Preview | Comment | Nov 12, 2025 1:54pm

@mmabrouk
Member Author

#2932

… API key setup and refining the evaluation process steps. This update improves user experience by ensuring necessary credentials are configured for LLM-based evaluators and clarifies the overall evaluation workflow.
…-judge evaluator and refine testset description. This enhances clarity for users configuring evaluations with the Agenta SDK.
…arameters in the Agenta SDK. This update improves clarity for users running evaluations and ensures proper configuration of evaluation details.
…ameter and updating example outputs. This enhances clarity and consistency in the Jupyter notebook and markdown files related to testset creation.
… Agenta SDK

This commit introduces two new guides: one for managing testsets, detailing creation, listing, and retrieval processes, and another for configuring evaluators, covering both custom and built-in evaluators. These additions enhance user understanding of evaluation workflows and improve the overall documentation structure by replacing the previous running evaluations guide.
@mmabrouk mmabrouk changed the title from AGE-3418-docs-for-new-evaluation-sdk to [Docs] AGE-3418 Docs for evaluation SDK Nov 12, 2025
… structure. This change enhances readability and maintains focus on the evaluation creation process using the Agenta SDK.
@mmabrouk mmabrouk marked this pull request as ready for review November 12, 2025 13:54
@dosubot dosubot bot added the size:XXL label (This PR changes 1000+ lines, ignoring generated files.) Nov 12, 2025
@mmabrouk mmabrouk requested a review from junaway November 12, 2025 13:54
@mmabrouk mmabrouk changed the base branch from main to release/v0.62.2 November 12, 2025 13:55
@mmabrouk mmabrouk changed the base branch from release/v0.62.2 to main November 12, 2025 13:55
@mmabrouk mmabrouk enabled auto-merge November 12, 2025 13:55
@mmabrouk mmabrouk merged commit bc4cf0f into main Nov 12, 2025
11 checks passed
@dosubot dosubot bot added the documentation (Improvements or additions to documentation) and Evaluation labels Nov 12, 2025
3 participants